
    Managing the Euclid Data Model

    The Euclid common data model is central in, and essential to, the Euclid science ground segment. It defines the format of all data exchanged between the pipelines and stored in the Euclid Archive, and ensures that all components can communicate with each other. But with more than 25 active contributors, managing the data model has been a challenge. Care must be taken that changes in the XML of the data model do not break its Python, C++, or database bindings. We describe recent progress in tackling these problems. The former problem has been mitigated with a new data model validator tool run during continuous integration. The latter has been partially solved via git management rules. Both approaches became possible only after the migration from SVN to git, which allowed the introduction of modern tooling.

    Organization of the Euclid Data Processing: Dealing with Complexity

    The data processing development and operations for the Euclid mission (part of the ESA Cosmic Vision 2015-2025 Plan) are distributed within a Consortium composed of 14 countries and more than 1,300 people. This imposes a high degree of complexity on the design and implementation of the data processing facilities. The focus of this paper is on the efforts to define an organisational structure capable of handling such complexity in manageable terms.

    The Euclid Archive Processing and Data Distribution Systems: A Distributed Infrastructure for Euclid and Associated Data

    The Euclid Archive System is an ambitious information system, which sits at the heart of the Euclid Science Ground Segment. It is a joint development between the Euclid Consortium and the ESAC Science Data Centre. It encompasses both Euclid data and the large volume of associated ground-based data (e.g. KiDS, DES and LSST). The Euclid Science Ground Segment consists of the Euclid Science Operations Centre and ten national Science Data Centres (SDCs). The large data volumes demand that data transfer is minimized and that the processing is taken to the data. This is supported by the Euclid Archive Data Processing System and the Euclid Archive Distributed Data System. The Data Processing System consists of a central metadata repository, which contains the information necessary to process any data item and the full data lineage of any data product created. The Distributed Data System provides a cloud solution with a node at each of the national Science Data Centres, which controls data storage and transfer. It supports a large number of storage types, including POSIX, iRODS, gridftp and Xrootd. No limitations are placed on the storage implemented at an individual SDC. Furthermore, the user of the system needs no knowledge of where data is located. Jobs will be started at the most appropriate locations, or data transferred as necessary.

    The Role of the Euclid Archive System in the Processing of Euclid and External Data

    Euclid is an ESA M2 mission which will create a 15,000-square-degree space-based survey. The Euclid Archive System (EAS) is a core element of the Science Ground Segment (SGS) of Euclid. The EAS follows a data-centric approach to data processing, whereby the Data Processing System (DPS) is responsible for the centralized metadata storage and the Distributed Storage System (DSS) supports the distributed storage of data files. The EAS-DPS implements the Euclid Common Data Model and, along with the EAS-DSS, provides numerous services for Euclid Consortium users and SGS subsystems. In addition, the EAS-DPS assists in the preparation of Euclid data releases, which are copied to the third EAS subsystem, the ESA-developed Science Archive System (SAS), where they become available to the wider astronomical community. The EAS-DPS implements the object-oriented Euclid Common Data Model using a relational DBMS for the storage. The EAS-DPS supports the tracing of the lineage of any data item in the system, and provides services for data quality assessment and data processing orchestration. The EAS-DSS is a distributed storage system based on a set of storage nodes located in each of the ten Science Data Centers of the Euclid SGS. The storage nodes support a wide range of solutions, from local disk using a Unix filesystem to iRODS nodes or Grid storage elements. In this paper the architectural design of the EAS-DPS and EAS-DSS is reviewed, and the interaction between them and tests of the already implemented components are described.
